Data structure of multimedia file format, encrypting method and device thereof, and decrypting method and device thereof

ABSTRACT

In a data structure of a multimedia file format, a movie box and a media data box are provided. In each box, a non-encrypted size field, a non-encrypted type field and box data field are provided. In box data of the movie box, information data regarding multimedia data is stored. The multimedia data is encrypted and stored in box data of the media data box. The information data is obtained by referring to the container in the movie box. This information data is held as encryption and encoding information data. By referring to the information data, a data unit of the encrypted multimedia data in the media data box is obtained, and the unit data is decrypted.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2002-097757, filed Mar. 29, 2002, the entire contents of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The present invention relates to a data structure of a multimedia file format, an encrypting method and an encrypting device thereof, and a decrypting method and a decrypting device thereof. More particularly, it relates to an encrypting method and an encrypting device of moving picture files in a moving picture recorder and a reproducing device equipped with memory cards.

[0004] 2. Description of the Related Art

[0005] In recent years, the form of content such as moving pictures has shifted from analog data to digital data. Digitized content can be copied without any deterioration of quality. Accordingly, content data can be copied between users through a CD-R, a recordable DVD disk or a memory card, or alternatively by a file transfer technology utilizing a communication network such as the Internet, for example, by sending the content data attached to e-mail. Such copying has become rampant, which brings about copyright problems in the content business world.

[0006] As a technique for protecting the copyright of digital content, there is a method for encrypting the content data. By this encryption, illegal copying can be prevented. In the conventional case of encrypting the content data, a method for encrypting the content data en bloc from head to end has generally been employed. Therefore, only those who have rights to use the content data, i.e., rights to decrypt the content data, can use the content data.

[0007] In the aforementioned conventional encrypting method, when the content data is encrypted en bloc from head to end, it is possible to prevent illegal copying. However, since the content data is encrypted en bloc from head to end, it is not easy to access an arbitrary position of the content data. To access the arbitrary position, even data not targeted for access must be decrypted. In practice, therefore, there is a problem of a useless processing requirement. That is, conventionally, accessing an arbitrary position of the encrypted content data requires sequentially decrypting the codes from the head of the content until the desired content position is reached. Such processing has the problem of a long processing time until data at the access position is obtained.

[0008] The processing of decrypting the codes until the desired content position is reached is necessary only for accessing a desired position, but not for actually using the content data. Thus, it can be said that it is useless processing.

[0009] The farther a desired access position is from the head of the file, the more the aforementioned useless processing and the processing time increase. Since the increases in processing load and processing time are accompanied by an increase in power consumption, portable equipment or the like using a battery has the problem of a reduction in continuous use time.

[0010] Access to an arbitrary position of the content data is necessary for realizing, for example, fast-forward reproduction, rewind reproduction, random access reproduction, and resume reproduction (a function of resuming from where reproduction was stopped by a user) in reproduction of moving pictures.

BRIEF SUMMARY OF THE INVENTION

[0011] Objects of the present invention are to provide a data structure of a multimedia file format which enables efficient access to a predetermined position of content data, an encrypting method thereof, and a decrypting method thereof.

[0012] According to the present invention, there is provided a data structure of a multimedia file format comprising:

[0013] a first box having first encrypted box data which stores a first non-encrypted size field to indicate a size of the first box by bytes, a first non-encrypted type field to identify a type of the first box, and encrypted multimedia data; and

[0014] a second box having second encrypted box data which stores a second non-encrypted size field to indicate a size of the second box by bytes, a second non-encrypted type field to identify a type of the second box, and encrypted information data regarding multimedia data stored in the second box data.

[0015] Furthermore, according to the present invention, there is provided a method of encrypting a multimedia file having a file format structure comprising a first box having first box data which stores a first size field to indicate a size of the first box by bytes, a first type field to identify a type of the first box, and multimedia data, and

[0016] a second box having second box data which stores a second size field to indicate a size of the second box by bytes, a second type field to identify a type of the second box, and information data regarding multimedia data stored in the second box data,

[0017] the method comprising:

[0018] encrypting the multimedia data to be stored in the first box data and storing the encrypted multimedia data in the first box data;

[0019] encrypting the information data to be stored in the second box data and storing the encrypted information data in the second box data; and

[0020] storing the first and second size fields and the first and second type fields in corresponding boxes without encryption.

[0021] Furthermore, according to the present invention, there is provided a device to encrypt a multimedia file having a file format structure comprising a first box having first box data which stores a first size field to indicate a size of the first box by bytes, a first type field to identify a type of the first box, and multimedia data; and

[0022] a second box having second box data which stores a second size field to indicate a size of the second box by bytes, a second type field to identify a type of the second box, and information data regarding multimedia data stored in the second box data,

[0023] the device comprising:

[0024] an encryption section which encrypts the multimedia data to be stored in the first box data to store the encrypted multimedia data in the first box data and which encrypts the information data to be stored in the second box data to store the encrypted information data in the second box data; and

[0025] a file generation section to store the first and second size fields and the first and second type fields in corresponding boxes without encryption.

[0026] Furthermore, according to the present invention, there is provided a method of decrypting a multimedia file having a file format structure comprising a first box having first encrypted box data which stores a first non-encrypted size field to indicate a size of the first box by bytes, a first non-encrypted type field to identify a type of the first box, and encrypted multimedia data, and a second box having second encrypted box data which stores a second non-encrypted size field to indicate a size of the second box by bytes, a second non-encrypted type field to identify a type of the second box, and encrypted information data regarding multimedia data stored in the second box data,

[0027] the method comprising:

[0028] decrypting the information data to be stored in the second box data and holding the decrypted information data as non-encrypted information data; and

[0029] decrypting and outputting at least a part of the multimedia data stored in the first box data based on the non-encrypted information data.

[0030] Furthermore, according to the present invention, there is provided a device to decrypt a multimedia file having a file format structure comprising a first box having first encrypted box data which stores a first non-encrypted size field to indicate a size of the first box by bytes, a first non-encrypted type field to identify a type of the first box, and encrypted multimedia data, and

[0031] a second box having second encrypted box data which stores a second non-encrypted size field to indicate a size of the second box by bytes, a second non-encrypted type field to identify a type of the second box, and encrypted information data regarding multimedia data stored in the second box data,

[0032] the device comprising:

[0033] a decryption section which decrypts the information data to be stored in the second box data;

[0034] a storage section which stores the decrypted information data; and

[0035] an output section which decrypts and outputs at least a part of the multimedia data stored in the first box data based on the non-encrypted information data.

[0036] Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out hereinafter.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING

[0037] The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention, and together with the general description given above and the detailed description of the embodiments given below, serve to explain the principles of the invention.

[0038] FIG. 1 is a plan view schematically showing a structure of an MP4 file to which an encrypting method of an embodiment of the present invention is applied.

[0039] FIG. 2 is a plan view schematically showing a general structure of each box shown in FIG. 1.

[0040] FIG. 3 is a plan view schematically showing a structure of a box of another type different from the structure shown in FIG. 2.

[0041] FIG. 4 is a plan view explaining encryption for a top-level box other than a media data box shown in FIG. 1.

[0042] FIG. 5 is a plan view explaining encryption for a top-level box other than a media data box of another type different from the structure shown in FIG. 2.

[0043] FIG. 6 is a plan view explaining execution of the encryption by a block unit for the box shown in FIG. 4 and non-execution of encryption for residual data when it is generated.

[0044] FIG. 7 is a plan view showing a header structure of the media data box shown in FIG. 1 and non-encryption thereof.

[0045] FIG. 8 is a plan view showing a structure of a movie box shown in FIG. 1.

[0046] FIGS. 9A and 9B are plan views showing other arrangement structures of a movie box shown in FIG. 1, respectively.

[0047] FIG. 10 is a plan view explaining a data structure in the media data box shown in FIG. 1.

[0048] FIG. 11 is a block diagram schematically showing an encryption system of an embodiment of the present invention.

[0049] FIG. 12 is a flowchart explaining an encrypting method in the encryption system shown in FIG. 11.

[0050] FIG. 13 is another flowchart explaining an encrypting method in the encryption system shown in FIG. 11.

[0051] FIG. 14 is a plan view showing an example when the media data box shown in FIG. 1 is encrypted.

[0052] FIG. 15 is a plan view showing another example when the media data box shown in FIG. 1 is encrypted.

[0053] FIG. 16 is a plan view showing yet another example when the media data box shown in FIG. 1 is encrypted.

[0054] FIG. 17 is a block diagram schematically showing a decryption system of an embodiment of the present invention.

[0055] FIG. 18 is a flowchart explaining a decrypting method in the decryption system shown in FIG. 17.

DETAILED DESCRIPTION OF THE INVENTION

[0056] Next, there will be described an encrypting method according to an embodiment of the present invention with reference to the accompanying drawings.

[0057] The encrypting method according to an embodiment of the present invention, which is applied to an MPEG-4 file format, will be described with reference to FIGS. 1 to 16.

[0058] FIG. 1 shows a structure of an MPEG-4 file format standardized by ISO. In the description hereinafter, a file in the MPEG-4 file format is referred to simply as an MP4 file. The MP4 file is a file format for storing a video stream or an audio stream encoded in accordance with MPEG-4. In this file format, codec streams other than those specified by the MPEG-4 standard can also be stored. The MP4 data may be stored as a file on a disk or as a binary image in a memory.

[0059] As shown in FIG. 1, the MP4 file has an object structure constituted of several boxes. It should be noted that the boxes may be referred to as atoms in some documents. In the MP4 file, boxes can be stored in a nested state where boxes are further inserted into other boxes. Here, the outermost box of this hierarchical nested structure, i.e., an uppermost box, is referred to as a top-level box. In FIG. 1, only the top-level boxes are shown.

[0060] As shown in FIG. 1, there are several types of top-level boxes. That is, the MP4 file is constituted of a file type box 11, a movie box 12, a media data box 13, a movie fragment box 14, a free space box 15, a skip box 16 etc. Some of these boxes are essential in the MP4 file, while others may be optionally described.

[0061] In the MP4, it is not necessary to array the boxes in the order shown in FIG. 1, and the constitution can be changed within the range of the defined rules. A detailed explanation of the specific definitions is omitted here. A feature of the MP4, however, is that the number of boxes of the same type, the position of a box and so on are specified in accordance with the type of the box, and that the constitution of the top-level boxes varies in accordance with the content data.

[0062] Now, a function of each top-level box will be described. The file type box 11 stores a type of a file such as a brand or a version of the file, and describes the file as defined by the MP4. The movie box 12 stores metadata of the entire MP4 data, i.e., information or the like necessary for decoding an encoded codec data stream of a medium, for example information describing an attribute, an address or the like necessary for data decoding. The media data box 13 stores an actually encoded codec stream of a medium, i.e., content data such as a video stream or an audio stream. The movie fragment box 14 stores the information of the movie box 12 in a divided manner. The free space box 15 and the skip box 16 store padding data. The user data box 17 stores user-defined data.

[0063] Next, a box structure will be described. All the boxes have a common structure. FIG. 2 shows a box 20 having the common structure. In the box 20, the first 4 bytes are a size field 21 indicating the size of the box in bytes. The next 4 bytes are a type field 22 identifying the type of the box. The type of the box is identified by four characters. For example, “moov” is set in the case of the movie box 12, and “mdat” is set in the case of the media data box 13. By matching these four characters, the type of the box can be identified. After the type field 22, a box data field or section 23 is stored. The structure of this box data field has a syntax defined for each box in accordance with its purpose. The size of the box data field is the value obtained by subtracting 8, the number of bytes used by the size field 21 and the type field 22, from the value of the size field 21.

[0064] As shown in FIG. 3, when the value of the size field 21 is 1 (Size=1), a large size field 24 of 8 bytes indicating the size of the box appears in this box 20 between the type field 22 and the box data field 23, so that even a large-capacity box whose size cannot be represented by the 4-byte size field 21 can be dealt with. In this box 20, the size of the box data field 23 is the value obtained by subtracting 16 from the size stored in the large size field 24.
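The header layout described in the two preceding paragraphs can be illustrated with a minimal Python sketch. It is not part of the embodiment; the function names parse_box_header and make_box are hypothetical, and big-endian byte order is an assumption taken from the ISO file format, since the text itself does not state the byte order.

```python
import struct

def parse_box_header(buf, pos=0):
    """Return (box_type, header_len, box_size) for the box starting at pos."""
    # First 4 bytes: size field; next 4 bytes: four-character type field.
    size, box_type = struct.unpack_from(">I4s", buf, pos)
    header_len = 8
    if size == 1:
        # Size=1 signals an 8-byte large size field after the type field.
        (size,) = struct.unpack_from(">Q", buf, pos + 8)
        header_len = 16
    return box_type.decode("ascii"), header_len, size

def make_box(box_type, data):
    """Build a box with a 4-byte size field (hypothetical helper for the demo)."""
    return struct.pack(">I4s", 8 + len(data), box_type.encode("ascii")) + data

box = make_box("mdat", b"\x00" * 15)
box_type, header_len, box_size = parse_box_header(box)
# Box data length = box size minus the header bytes (8, or 16 with a large size field).
data_len = box_size - header_len
```

In this sketch the box data length is always the box size minus the header length, which matches the subtraction of 8 or 16 described above.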

[0065] According to the encrypting method of the embodiment of the present invention, data encryption or non-encryption is decided for each top-level box. That is, as shown in FIG. 4, if the value of the size field 21 is not 1 (size!=1), the data of the size field and the type field are not encrypted (hereinafter, leaving data unencrypted may be referred to simply as non-encryption), and the box data is targeted for encryption.

[0066] The media data in the media data box 13 are mandatorily encrypted as described later. Box data in the other boxes 11, 12, 14, 15, 16 may or may not be encrypted, as described later.

[0067] As shown in FIG. 5, even if the value of the size field 21 is 1 and the large size field 24 is present between the type field 22 and the box data field 23, this large size field 24 is not targeted for encryption, either. That is, according to the encrypting method of the embodiment of the present invention, only the box data in the box data field 23 is targeted for encryption. Some encrypting methods require a data block length of a plurality of bytes. In other words, if the data targeted for encryption is divided by a predetermined block length to be encrypted, residual data shorter than the predetermined block length may be generated, creating a possibility that this data length will not reach the number of bytes necessary for encryption. If residual bytes are generated in the data to be encrypted, and the number of these bytes is smaller than the number of bytes required for encryption, as shown in FIG. 6, the residual data in this residual block may be left unencrypted. An example is a case where the box data length is 15 bytes and the encrypting method needs a data block length of 8 bytes. In this case, the first 8 bytes of the box data are encrypted, while the remaining 7 bytes are not.
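The 15-byte example can be sketched in Python as follows. The text does not name a cipher, so toy_encrypt_block is a hypothetical XOR stand-in used only to make the sketch runnable; it is not secure and is not the cipher of the embodiment. The function name encrypt_box_data is likewise an assumption.

```python
BLOCK_LEN = 8  # block length taken from the 15-byte example in the text

def toy_encrypt_block(block, key):
    # Hypothetical stand-in for a real block cipher: XOR with the key. Not secure.
    return bytes(b ^ k for b, k in zip(block, key))

def encrypt_box_data(data, key):
    """Encrypt whole blocks of box data; leave a short residual unencrypted."""
    whole = len(data) - len(data) % BLOCK_LEN
    out = bytearray()
    for i in range(0, whole, BLOCK_LEN):
        out += toy_encrypt_block(data[i:i + BLOCK_LEN], key)
    out += data[whole:]  # residual shorter than one block is copied as-is
    return bytes(out)

key = bytes(range(1, 9))
plain = bytes(range(100, 115))          # 15 bytes of box data
cipher = encrypt_box_data(plain, key)   # first 8 bytes encrypted, last 7 unchanged
```

Because the residual is copied unchanged, the box data keeps its length, so the non-encrypted size field remains valid after encryption.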

[0068] As described above, only the box data is encrypted. Therefore, for example when access to the movie box 12 is attempted, the first 8 bytes of the MP4 data are first acquired to obtain a box size and a box type field. Then, the box type is checked against the type of the movie box 12. If they do not coincide, that is, if the box type is not the type of the movie box 12, an access pointer is shifted by an amount equal to the box size, and the next 8 bytes are acquired to obtain a box size and a box type field. This access pointer shifting is repeated until the box type coincides with the type of the movie box 12. When the box type coincides with the type of the movie box 12, the encrypted box data are sequentially decrypted to enable access to the box data in the movie box 12.
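The header walk just described might look like the following Python sketch. The function find_top_level_box and the helper make_box are hypothetical names, big-endian fields are assumed, and the guard against a size smaller than the header is an addition of this sketch rather than something the text specifies.

```python
import struct

def find_top_level_box(buf, wanted):
    """Return (data_offset, data_len) of the first top-level box of type wanted, or None."""
    pos = 0
    while pos + 8 <= len(buf):
        # The size and type fields are never encrypted, so they can be read directly.
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header_len = 8
        if size == 1:
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header_len = 16
        if size < header_len:
            break  # malformed or open-ended box; stop rather than loop forever
        if box_type == wanted:
            return pos + header_len, size - header_len
        pos += size  # shift the access pointer by the box size
    return None

def make_box(box_type, data):
    """Build a box with a 4-byte size field (hypothetical helper for the demo)."""
    return struct.pack(">I4s", 8 + len(data), box_type) + data

mp4 = (make_box(b"ftyp", b"mp42")
       + make_box(b"mdat", b"\x00" * 20)
       + make_box(b"moov", b"encrypted-metadata"))
found = find_top_level_box(mp4, b"moov")
```

No box data is decrypted while skipping; only the data of the box that is finally located would need decryption.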

[0069] Next, encryption of media data in the media data box 13 will bedescribed.

[0070] Unlike the other top-level boxes, which store information necessary for decoding media data streams, the media data box 13 stores media data. Encryption of this media data requires a capability of efficiently accessing an arbitrary position of the media data during special reproduction such as skip reproduction, fast-forward reproduction, rewind reproduction or resume reproduction. Thus, as shown in FIG. 6, in addition to non-encryption of the size field and the type field, the stream data are encrypted separately for each independently encoded unit. In this case, a sample or a frame is equivalent to the encoded unit for an audio stream, and a frame is equivalent to the encoded unit for a moving picture stream.

[0071] In the encryption of the media data in the media data box 13 of the embodiment of the present invention, the encoded unit to be encrypted is a sample in the MP4 data. Instead of the sample, a chunk may be encrypted in the media data box. The position of each sample in the MP4 data can be obtained by analyzing the chunk offset and the sample size in the movie box 12 describing the sample. That is, the position of the chunk to which the sample belongs is described in the chunk offset as an offset from the head of the data file, and the size of each sample included in the chunk is described in the sample size. Accordingly, the offsets of all the samples can be obtained by referring to the chunk offset and the sample size.

[0072] To provide a clearer explanation, the structure of the movie box 12 and the data structure in the media data box 13 in the MP4 will be described by referring to FIG. 8 to FIG. 10.
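As a rough illustration of how sample positions follow from the chunk offset and the sample size, consider the Python sketch below. The function name sample_offsets and the flat per-chunk list of sample counts are simplifying assumptions; the sketch also assumes that the samples of a chunk are stored back to back.

```python
def sample_offsets(chunk_offsets, samples_per_chunk, sample_sizes):
    """Return a list of (offset, size) per sample, in track order."""
    table = []
    sample_index = 0
    for chunk_offset, count in zip(chunk_offsets, samples_per_chunk):
        offset = chunk_offset  # offset of the chunk from the head of the file
        for _ in range(count):
            size = sample_sizes[sample_index]
            table.append((offset, size))
            offset += size     # the next sample in the chunk starts right after
            sample_index += 1
    return table

# Two chunks at offsets 100 and 400, holding 2 samples and 1 sample.
table = sample_offsets([100, 400], [2, 1], [30, 50, 20])
# table == [(100, 30), (130, 50), (400, 20)]
```

With such a table, any sample can be located directly, without touching the samples before it.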

[0073] FIG. 8 shows the structure of the movie box 12 referred to as moov (Movie Box). In the box of FIG. 8, only the movie box 12 (Movie Box) equivalent to a data box portion targeted for encryption is shown, while the size field, the large size field and the type field not targeted for encryption, which are described above with reference to FIGS. 4 to 8, are not shown. Similarly in FIG. 8, mdat is shown as the media data box 13, in which the size field, the type field and the large size field are present and content data (multimedia data) as real data is stored as box data. In the description of FIGS. 8, 9A and 9B, it should be understood that there are a size field, a type field and a large size field.

[0074] In the format shown in FIG. 8, one MP4 file is constituted of moov (Movie Box) describing file information as a header of a first layer, and mdat (Media Data Box 13) storing multimedia data containing audio data and video data. In this MP4 file, free as a free space of the first layer, skip, and udta (User Data Box) permitting writing defined by a user are additionally disposed.

[0075] In the MP4 file, data are generally classified and managed in units called boxes. These boxes can take a hierarchical structure from a top layer to a bottom layer, and a box further including a lower layer therein is referred to as a “container box”. The boxes described here may be referred to as atoms.

[0076] The moov (Movie Box) as the header includes, on a second layer, mvhd (Movie Header Box) describing header information such as the creation time of the MP4 file and a content of the MP4 file, iods (Object Descriptor Box) describing information regarding an object, i.e., a reproduction target, and trak (Track Box) describing various parameters regarding multiplexed media information. If there are many multiplexed media, as many tracks (Track Boxes) as there are media are prepared. For example, in a content multiplexing a sound and a video, an audio media track and a video media track are prepared. A parameter of an audio medium is stored in the audio track, and a parameter of a video medium is stored in the video track.

[0077] As shown in FIG. 8, the trak (Track Box) includes, on a third layer, tkhd (Track Header Box) storing the creation time of a track and a series of numbers called track IDs (identifiers) for identifying tracks, tref (Track Reference Box) having description regarding a track, edts (Edit Box) regarding edit information, and mdia (Media Box) having description regarding media information. The edit box edts includes elst (Edit List Box) describing edit list information on a fourth layer. The media box mdia includes, on the fourth layer, mdhd (Media Header Box) storing information regarding a time scale or the like of the media track, hdlr (Handler Reference Box) describing information for reference to the handler, and minf (Media Information Box) storing information regarding media. The media information box minf includes, on a fifth layer, vmhd (Video Media Header Box) indicating that a medium stored in a track is a video or smhd (Sound Media Header Box) indicating that a medium stored in a track is a sound, hmhd (Hint Media Header Box) describing header information of a hint medium, mpeg (MPEG-4 Media Box) describing header information of the MPEG-4 if a medium is an MPEG-4 stream other than a video or a sound, dinf (Data Information Box) describing data information, and stbl (Sample Table Box) describing information regarding a sample. Either the video media header box vmhd or the sound media header box smhd is described, in accordance with the type of medium stored in the track, i.e., a sound or a video. Further, the dinf (Data Information Box) includes dref (Data Reference Box) describing information for reference to data.
The stbl (Sample Table Box) includes stts (Decoding Time to Sample Box) setting decoding time of each sample, ctts (Composition Time to Sample Box) describing the composition (display) time of a sample, stss (Sync Sample Box) describing synchronization information of a sample, stsd (Sample Description Box) setting a type of a codec or a variety of information necessary for decoding, stsz (Sample Size Box) setting the total number of samples in a track (sample_count) and a data size of each sample (entry_size), stsc (Sample to Chunk Box) describing the number of samples in a chunk (sample_per_chunk) and an index of a sample (sample_description_index), stco (Chunk Offset Box) describing offset position information from a head of a file regarding a chunk (chunk_offset), stsh (Shadow Sync Sample Box) describing synchronization information, and stdp (Degradation Priority Box). If necessary, a plurality of stsd (Sample Description Box) can be set.

[0078] In the described case, as shown in FIG. 10, a sample is a unit of a certain size into which actual media such as a video or a sound is divided. Media data is managed based on this sample. A chunk is a concatenation of one or a plurality of samples. Information regarding the internal structure of a data area, such as a chunk position from the head of the file or the number of samples included in the chunk, is described in the lower layer of the moov container box as described above. As described above, the actual media data is arranged in the mdat box, and a box called a track is allocated to information management for each medium such as a sound or a video. Thus, in the MP4 file, by obtaining the moov container box, the number of constituting media, types, data sizes etc. can be discovered.

[0079] Generally, for the boxes of MP4, there are no rules about an arrangement order on the same layer. On the first layer of FIG. 8, the moov, the mdat, the moof, the free, the skip and the udta are arrayed in this order. However, this does not mean that the boxes must always be arrayed in this order from the head of the file. That is, on the first layer, the mdat, the moov, the free, the skip and the udta may be arrayed in this order, as shown in FIG. 9A, or the moov, the udta, the mdat, the moof, the mdat and the skip may be arrayed in this order, as shown in FIG. 9B. Furthermore, in the MP4 file, only one moov block is provided, but a plurality of blocks may be provided for the mdat and/or the moof.

[0080] The data in the moov container box excluding the size field and the type field shown in FIG. 6, or excluding the size field, the type field and the large size field shown in FIG. 7, are encrypted. Similarly, real stream data in the mdat container box excluding the size field, the type field and the large size field are encrypted. The MP4 file may have only one mdat but a plurality of the other boxes corresponding to the mdat and/or moof.

[0081] Such encryption is realized, for example, by a moving picture recording system 100 as shown in FIG. 11. In the moving picture recording system 100, audio and video data are encrypted in the order shown in FIG. 11. Now, a format process including encryption in the moving picture recording system 100 will be described by referring to FIGS. 11 and 12.

[0082] An audio signal captured from a microphone 101 or an audio input device is encoded by an audio encoder 102 and converted into encoded audio data, for example MPEG-4 audio data. Similarly, a video signal captured from a camera 103 or a video input device is encoded by a video encoder 104 and converted into encoded video data, for example MPEG-4 video data. Here, both analog and digital signals may be inputted from the microphone 101 and the camera 103 as external input devices to the moving picture recording system 100. From the audio encoder 102, an audio encoded stream generated therein is outputted to a file generation section 105. Similarly, from the video encoder 104, a video encoded stream generated therein is outputted to the file generation section 105. At the file generation section 105, the audio encoded stream and the video encoded stream outputted from the audio encoder 102 and the video encoder 104 are arranged in a predetermined MP4 file format as shown in FIG. 8, and developed in a local memory 106. After completion of the file generation, as described by referring to FIGS. 12 and 13, at an encryption section 107, the file stored in the local memory 106 is encrypted by a predetermined encrypting method, rearranged in the local memory 106, and outputted as an encrypted file.

[0083] Upon a start of encryption (step S10), the movie box 12 (moov) is searched for in the MP4 file stored in the local memory 106 as shown in step S11. Here, as the movie box 12 is a top-level box, a size field and a type field are read from the head of the file, and a box having a type field set as moov is searched for. If a first box is not moov, seeking is carried out by an amount equal to the read size, and a next box is analyzed. The search is continued until a type field indicated as moov is found.

[0084] After the detection of the movie box 12, a chunk offset box (stco), a sample to chunk box (stsc) and a sample size box (stsz) stored for each track in the movie box 12 are searched for, and the tables held therein are saved in the memory. That is, in step S12, an initial value of N is set to 1, and the chunk offset stco of a first chunk described in a first track trak in the movie box 12 is read. An offset address is read from chunk_offset in the chunk offset box stco, and all sample sizes belonging to the track are read from entry_size of the sample size box stsz. Additionally, the number of all chunks in the track is read from entry_count in the chunk offset box stco, the number of samples of each chunk is read from sample_per_chunk of the sample to chunk box stsc, and the total number of all samples in the track is read from sample_count of the sample size box.

[0085] Similar items are read for the other tracks. From these read items, a table describing an offset of each chunk and an offset of each sample in offset order is made.

[0086] That is, as shown in FIG. 10, in the media data stored in the media data box 13 where an audio chunk (A chunk) belonging to the audio track and a video chunk (V chunk) belonging to the video track appear alternately, a table is made regarding the chunks indicated from an offset 0 to an offset x, and the offset address of each chunk is copied into the table from chunk_offset. In the table, sample items are made in accordance with the number of samples constituting each chunk, and the position and the size of each sample are described based on the sample size of the relevant sample. In the table that has been made, the total number of chunks and the total number of samples are checked based on the number of chunks and the number of samples of each track.

[0087] Then, by referring to the table, a first sample in the media data box 13 is encrypted and written in the local memory 106 as shown in step S13. Then, it is checked in step S14 whether the encrypted sample of number N is the last sample in the media data box 13. If the encrypted sample is not the last sample, the number of the sample to be encrypted is incremented by 1 as shown in step S15. The process returns to step S12 to obtain the position and the size of a sample from the table again, and this sample is encrypted in step S13. The process from step S12 to S15 is repeated and, if the encrypted sample is the last sample in the media data box (mdat) 13, the process is finished as shown in step S18.
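The sample-by-sample loop can be sketched in Python as below. This is a simplified illustration under stated assumptions: the per-track tables are already available as lists of (offset, size) pairs, the whole file is held in a bytearray, and toy_encrypt_block is a hypothetical XOR stand-in for the unspecified cipher. The name encrypt_samples is also hypothetical.

```python
BLOCK_LEN = 8  # assumed block length, as in the 8-byte examples of the text

def toy_encrypt_block(block, key):
    # Hypothetical stand-in for a real block cipher (XOR); not secure.
    return bytes(b ^ k for b, k in zip(block, key))

def encrypt_samples(file_image, track_tables, key):
    """Encrypt every sample in place; return the number of samples processed."""
    # Merge the per-track tables into one table in offset order.
    table = sorted(entry for track in track_tables for entry in track)
    for offset, size in table:
        # Only whole blocks inside the sample are encrypted.
        end = offset + size - size % BLOCK_LEN
        for i in range(offset, end, BLOCK_LEN):
            file_image[i:i + BLOCK_LEN] = toy_encrypt_block(
                file_image[i:i + BLOCK_LEN], key)
    return len(table)

file_image = bytearray(range(64))
audio_track = [(8, 16)]            # one 16-byte audio sample
video_track = [(24, 10), (40, 3)]  # a 10-byte and a 3-byte video sample
count = encrypt_samples(file_image, [audio_track, video_track], b"\xff" * 8)
```

Each sample is encrypted independently of its neighbours, which is what allows a reader to decrypt a single sample at a known offset later.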

[0088] If boxes other than the media data box 13 are also subjected to encryption, the process from step S12 to S15 is repeated, as shown in FIG. 13, in the same manner as that in FIG. 12. If the encrypted sample is equivalent to a last sample in the media data box (mdat) 13, another box is encrypted in step S16 on the understanding that encryption of real data in the media data box 13 is finished. For example, the movie box 12 (moov) used for encrypting the real data in the media data box 13 is encrypted. Needless to say, none of the size field, the type field and the large size field in the media data box 13 and the movie box 12 (moov) is encrypted as described above.

[0089] In step S17, if not all of the boxes have been encrypted, the process returns to step S16 to sequentially encrypt the boxes in the MP4 file.

[0090] In step S17, if encryption of all the boxes is finished, the process is finished as shown in step S18.

[0091] In the foregoing description, in the media data box 13, the sample is encrypted for each predetermined block length. If a residual portion is generated, this portion is not encrypted. For example, if a predetermined block length is 8 bytes and a sample has a size of N bytes which is an integral (n) multiple of 8 bytes (N=n×8), the sample is encrypted without any non-encrypted residuals as shown in FIG. 14. On the other hand, if a predetermined block length is 8 bytes and a sample has a size exceeding an integral (n) multiple of 8 bytes (N=n×8+m, m<8), as shown in FIG. 15, a portion of the sample which is the integral multiple of 8 bytes of the predetermined block length is encrypted while the remaining portion (m bytes) is not encrypted. Similarly, if a predetermined block length is 8 bytes and a sample has a size smaller than 8 bytes (N<8), as shown in FIG. 16, the sample is not encrypted.
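The three cases above (N=n×8, N=n×8+m, and N<8) can be illustrated by the following sketch. The description does not name a particular cipher here, so a trivial XOR function, toy_block_cipher, is used purely as a stand-in and provides no security. The function names and the key are hypothetical.

```python
BLOCK_LENGTH = 8  # predetermined block length in bytes, as in the example above


def toy_block_cipher(block: bytes, key: bytes) -> bytes:
    """Placeholder for a real block cipher: XOR with the key (NOT secure)."""
    return bytes(b ^ k for b, k in zip(block, key))


def encrypt_sample(sample: bytes, key: bytes,
                   block_length: int = BLOCK_LENGTH) -> bytes:
    """Encrypt only the whole blocks of a sample; leave the residual as is."""
    if len(key) != block_length:
        raise ValueError("key length must equal the block length in this sketch")
    whole = len(sample) - len(sample) % block_length  # n x block_length bytes
    out = bytearray()
    for start in range(0, whole, block_length):
        out += toy_block_cipher(sample[start:start + block_length], key)
    out += sample[whole:]  # residual m bytes (or an entire short sample) stay plaintext
    return bytes(out)
```

Because only whole blocks are transformed, the encrypted sample has the same size as the original sample, so the sample sizes and offsets recorded in the movie box remain valid after encryption.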

[0092] In the encryption process described above with reference to FIG. 13, it is assumed that the MP4 file has been stored in the local memory 106, i.e., file generation has been completed. However, obviously, the encryption process can be executed while the file is generated.

[0093] The file containing the encrypted audio and video data is decrypted, for example, by a moving picture reproduction system 200 similar to that shown in FIG. 17. The decryption in this moving picture reproduction system 200 is realized by a process shown in FIG. 18. Now, the decryption process in the moving picture reproduction system 200 will be described by referring to FIGS. 17 and 18.

[0094] FIG. 17 shows the moving picture reproduction system 200 for decrypting the encrypted audio and video data of the MP4 file and converting the data into audio and video signals. In the moving picture reproduction system 200, the encrypted MP4 file containing data encrypted in the process shown in FIG. 13 is inputted to a local memory 206 to be stored therein. As described later with reference to FIG. 18, the encrypted file is decrypted by a predetermined decrypting method at a decryption section 207, and rearranged in the local memory 206. The file developed in the local memory is separated into an audio encoded stream and a video encoded stream at a file analysis section 205, which are respectively supplied to an audio decoder 202 and a video decoder 204. The audio decoder 202 decodes the supplied audio encoded stream, and outputs the audio signal to a speaker 201 to be reproduced. The video decoder 204 decodes the supplied video encoded stream, and outputs the video signal to an image output device 203 to display a moving picture thereon.

[0095] The process of decrypting the encrypted file will be described by referring to FIG. 18. It is assumed herein that the encrypted MP4 file has been stored in the local memory 206 and encrypted for each sample in the media data box 13.

[0096] Upon a start of a decryption process (step S20), decryption is carried out for the boxes other than the media data box 13 (mdat) as shown in step S21. As described above with reference to FIGS. 4 to 7, in each box, none of the size field, the type field and the large size field is encrypted. Accordingly, by referring to these fields, the boxes other than the media data box (mdat) 13 are checked, and an encrypted box data portion of each box is decrypted. The decrypted box is stored again in the local memory 206. As shown in step S22, the process is repeated until the decryption of the boxes other than the media data box (mdat) 13 is finished. Upon the end of this processing, the process moves to next processing shown in step S23.
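Because the size field, the type field and the large size field are left as plaintext, the boxes can be located without decrypting anything. The following sketch walks the top-level boxes of a file buffer. It assumes the byte layout of the ISO base media file format (a 4-byte big-endian size, a 4-byte type, and an 8-byte large size when the size field equals 1), which this passage does not spell out. The name iter_boxes is hypothetical.

```python
import struct
from typing import Iterator, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_start, payload_end) for each top-level box."""
    pos = 0
    while pos + 8 <= len(data):
        # The size and type fields are plaintext, so they can be read directly.
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # Assumed convention: size == 1 means a 64-bit large size follows.
            if pos + 16 > len(data):
                raise ValueError("truncated large size field")
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # Assumed convention: size == 0 means the box runs to the end.
            size = len(data) - pos
        if size < header or pos + size > len(data):
            raise ValueError("malformed box at offset %d" % pos)
        yield box_type.decode("ascii", "replace"), pos + header, pos + size
        pos += size
```

A decrypting device could use the yielded type to skip the media data box (mdat) in this first pass and apply its decrypting method only to the payload range of each other box.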

[0097] If only the media data box 13 is subjected to encryption and the other boxes are not subjected to encryption, step S23 is started after the start step S20.

[0098] In step S23, the decrypted movie box 12 is searched for in the file. After the movie box 12 has been found, as shown in step S24, by a method similar to that of the encryption, a chunk offset box (stco), a sample to chunk box (stsc) and a sample size box (stsz) stored for each track in the movie box 12 are searched, and tables held therein are saved in the memory. That is, in step S24, an initial value of N is set to 1, and a chunk offset stco of a first chunk described in a first track trak in the movie box 12 is read. An offset address is read from chunk_offset in the chunk offset stco, and all sample sizes belonging to the track are read from entry_size of the sample size box stsz. Additionally, the number of all chunks in the track is read from entry_count in the chunk offset stco, the number of samples of each chunk is read from sample_per_chunk of stsc, i.e., the sample to chunk box, and the total number of all samples in the track is read from sample_count of the sample size box.

[0099] Similar items are read for the other tracks. From these read items, a table describing an offset of each chunk and an offset of each sample in an offset order is made.

[0100] That is, as shown in FIG. 10, in the media data stored in the media data box 13 where an audio chunk (A chunk) belonging to the audio track and a video chunk (V chunk) belonging to the video track alternately appear, a table is made regarding a chunk indicated from an offset 0 to an offset x, and an offset address of each chunk is copied in the table from chunk_offset. In the table, sample items are made in accordance with the number of samples constituting each chunk, and a position and a size of a sample are described from a sample size of the relevant sample. In the table that has been made, the total number of chunks and the total number of samples are checked based on the number of chunks and the number of samples of each track.

[0101] Then, by referring to the table, a first sample is decrypted and written in the local memory 206 as shown in step S25. Then, checking is carried out on whether a number N of the decrypted sample is a last sample or not in the media data box 13 in step S26. If the decrypted sample is not a last sample, a sample number to be decrypted is incremented by 1 as shown in step S27. As shown in step S24, the process returns to the step of obtaining a position and a size of a sample from the table again, and this sample is decrypted in step S25. The process from step S24 to S27 is repeated and, if the decrypted sample is equivalent to a last sample in the media data box (mdat) 13, decryption of real data in the media data box 13 is finished.

[0102] As a modified example of the foregoing embodiment, an offset of each sample may be obtained by referring to the movie fragment box. That is, in the MP4 file where the movie fragment box is present, a track fragment run box which has a function similar to that of a chunk offset box (stco) and a sample size box (stsz) is described in the movie fragment box. Thus, an offset of each sample can be similarly obtained by analyzing the chunk offset stco and the sample size stsz.

[0103] In the foregoing embodiment, the data in the sample is encrypted by using the offset value and the size of the sample. Since the sample is a minimum unit necessary for decoding the encoded stream, if access can be made to the sample unit, it is possible to efficiently access a sample in an arbitrary position in the aforementioned special reproduction. That is, in the process shown in FIG. 13, steps S10 to S12 are carried out. In step S12, if an N-th sample is a target sample, only the target sample is decrypted. This decrypted sample is decoded into an audio or video signal to be reproduced. By the reproduction of only the target sample, special moving picture reproduction is realized, for example, fast-forward reproduction, rewind reproduction, random access reproduction, and resume reproduction, i.e., reproduction resumed from where it was stopped by the user. Similar reproduction is enabled for sound.
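The access to a single target sample described above can be sketched as follows. The sketch assumes a table of (offset, size) pairs in offset order, offsets measured from the start of the file buffer, and the 8-byte block rule in which a residual portion is left as plaintext. The function decrypt_target_sample and the decrypt_block callable are hypothetical. No particular cipher is implied.

```python
from typing import Callable, List, Tuple


def decrypt_target_sample(file_data: bytes,
                          table: List[Tuple[int, int]],
                          n: int,
                          decrypt_block: Callable[[bytes], bytes],
                          block_length: int = 8) -> bytes:
    """Decrypt only the N-th sample (1-based) and return its plaintext."""
    if not 1 <= n <= len(table):
        raise IndexError("sample number out of range")
    offset, size = table[n - 1]           # position and size from the table
    sample = file_data[offset:offset + size]
    whole = size - size % block_length    # only whole blocks were encrypted
    plain = bytearray()
    for start in range(0, whole, block_length):
        plain += decrypt_block(sample[start:start + block_length])
    plain += sample[whole:]               # the residual was stored as plaintext
    return bytes(plain)
```

No other sample is read or decrypted, which is what allows fast-forward, rewind, random access and resume reproduction to jump directly to the sample of interest.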

[0104] In the foregoing embodiment, encryption is carried out for each sample in the media data box 13. In place of the encryption for each sample, the data in the chunk may be encrypted for each chunk. As described above, the chunk is a collection of the continuous samples of the same media in the media data. Encryption for each chunk need only be carried out in the same manner as the encryption for each sample unit. In this encryption for each chunk, since the number of times of resetting encryption is reduced compared with the encryption for each sample, the processing required for encryption and decryption can be reduced. In the encryption and the decryption for each chunk, in FIGS. 11 and 16, by processing collected chunk information, chunk encryption and decryption are enabled as in the case of the sample.

[0105] The encrypting method and the decrypting method of the present invention can be applied to equipment for storing the MP4 file format such as a mobile phone, a digital camera, a digital movie camcorder, a digital hard disk recorder, a PDA (Personal Digital Assistant), etc.

[0106] Further, even in a JPEG 2000 file format using a similar box structure, the encrypting method and the decrypting method of the present invention can be applied.

[0107] As described above, according to the embodiment of the present invention, by carrying out encryption for each box, it is possible to efficiently access an arbitrary box present in the MP4 data. Moreover, by carrying out encryption for data other than the size field and the type field, it is possible to access a desired box, without any decryption, by using the size field and the type field, which are plaintext.

[0108] Furthermore, according to the embodiment of the present invention, it is possible to access a box including audio or moving picture encoded data, efficiently access a sample or a chunk in the box, and realize special audio or moving picture reproduction.

[0109] Additional advantages and modifications will readily occur to those skilled in the art. Therefore, the invention in its broader aspects is not limited to the specific details and representative embodiments shown and described herein. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.

What is claimed is:
1. A data structure of a multimedia file format comprising: a first box including a first box data field configured to store encrypted multimedia data, a first size field configured to describe non-encrypted size information indicating a size of the first box by bytes, and a first type field configured to describe non-encrypted type information identifying a type of the first box; and a second box including a second box data field configured to describe information data regarding the encrypted multimedia data stored in the first box data, a second size field configured to describe non-encrypted size information indicating a size of the second box by bytes, and a second type field configured to describe non-encrypted type information identifying a type of the second box.
2. The data structure of the multimedia file format according to claim 1, wherein the first or second boxes includes first and second large size fields to describe second non-encrypted size information indicating sizes of the boxes by bytes together with the size fields if the first or second size fields have predetermined values, respectively.
3. The data structure of the multimedia file format according to claim 1, wherein the multimedia data includes audio or moving picture, the audio or moving picture encoded data are stored as an arrangement of data samples in the first box data field, and the data sample includes encrypted and encoded data.
4. The data structure of a multimedia file format according to claim 3, wherein the encrypted and encoded data is the data arrangement encrypted based on a unit of a block set as a predetermined data length, and a non-encrypted and encoded data arrangement is received in the sample by using the block as a reference.
5. The data structure of the multimedia file format according to claim 1, wherein the multimedia data includes audio or moving picture, the audio or moving picture encoded data are stored as an arrangement of data samples in the first box data field, one or a plurality of samples are set in a chunk, and the chunk includes encrypted and encoded data.
6. The data structure of a multimedia file format according to claim 5, wherein the encrypted and encoded data is the data arrangement encrypted based on a unit of a block set as a predetermined data length, and a non-encrypted and encoded data arrangement is received in the chunk by using the block as a reference.
7. The data structure of a multimedia file format according to claim 1, wherein the information data in the second box data field is encrypted.
8. A method of encrypting a multimedia file having a file format structure which comprises a first box including a first box data field, a first size field, and a first type field, and a second box including a second data field, a second size field, and a second type field, the method comprising: encrypting multimedia data and storing the encrypted multimedia data in the first box data field; storing the information data describing information data regarding the encrypted multimedia data in the second box data field; storing first and second non-encrypted size information describing non-encrypted size information indicating a size of the first and second boxes by bytes in the first and second size fields without encryption; and storing first and second non-encrypted type information describing first and second non-encrypted type information identifying a type of the first and second boxes in the first and second type fields without encryption.
9. The method according to claim 8, wherein the multimedia data includes audio or moving picture, the audio or moving picture encoded data is separated in the data units which correspond to an arrangement of data samples and the data sample is stored in the first box data field, each of the data samples being encrypted and encoded.
10. The method according to claim 8, wherein the encrypted and encoded data is separated in a data arrangement of the data units, which corresponds to a block set having a predetermined data length, and the data arrangement is received in the sample by using the block as a reference.
11. The method according to claim 8, wherein the multimedia data includes audio or moving picture, the audio or moving picture encoded data unit corresponds to an arrangement of data samples in the first box data field, one or a plurality of samples are received in a chunk, and the chunk being encrypted.
12. The method according to one of claim 10, wherein the encoded data are separated in data arrangements encrypted based on a unit of a block having a predetermined data length, and the encoded data arrangement is received in the chunk by using the block as a reference.
13. The method according to one of claim 8, wherein the storing the information data includes encrypting the information data to store the encrypted information in the second box data field.
14. An apparatus for encrypting a multimedia file having a file format structure which comprises a first box including a first box data field, a first size field, and a first type field, and a second box including a second data field, a second size field, and a second type field, the apparatus comprising: encrypting module configured to encrypt multimedia data and store the encrypted multimedia data in the first box data field; first storing module configured to store the information data describing information data regarding the encrypted multimedia data in the second box data field; second storing module configured to store first and second non-encrypted size information describing non-encrypted size information indicating a size of the first and second boxes by bytes in the first and second size fields without encryption; and third storing module configured to store first and second non-encrypted type information describing first and second non-encrypted type information identifying a type of the first and second boxes in the first and second type fields without encryption.
15. The apparatus according to claim 13, wherein the multimedia data includes audio or moving picture, the encrypting module separates the audio or moving picture encoded data in the data units which correspond to an arrangement of data samples and stores the data sample in the first box data field, each of the data samples being encrypted and encoded.
16. The apparatus according to claim 13, wherein the encrypting module separates the encrypted and encoded data in a data arrangement of the data units, which corresponds to a block set having a predetermined data length, the data arrangement being received in the sample by using the block as a reference.
17. The apparatus according to claim 13, wherein the multimedia data includes audio or moving picture, the audio or moving picture encoded data unit corresponds to an arrangement of data samples in the first box data field, one or a plurality of samples being received in a chunk, and the chunk being encrypted.
18. The apparatus according to one of claim 17, wherein the encrypting module separates the encoded data in data arrangements encrypted based on a unit of a block having a predetermined data length, and the encoded data arrangement is received in the chunk by using the block as a reference.
19. A method of decrypting a multimedia file having a file format structure which comprises a first box including a first box data field configured to store data units of encrypted multimedia data, a first size field configured to describe first non-encrypted size information indicating a size of the first box by bytes, and a first type field configured to describe first non-encrypted type information identifying a type of the first box, and a second box including a second box data field configured to describe information data regarding the encrypted multimedia data stored in the first box data, a second size field configured to describe second non-encrypted size information indicating a size of the second box by bytes, and a second type field configured to describe second non-encrypted type information identifying a type of the second box, the method comprising: obtaining the information data stored in the second box data and holding the information data; and decrypting each unit of the multimedia data stored in the first box data based on the information data, and outputting the decrypted data units, sequentially.
20. An apparatus for decrypting a multimedia file having a file format structure which comprises a first box including a first box data field configured to store data units of encrypted multimedia data, a first size field configured to describe non-encrypted size information indicating a size of the first box by bytes, and a first type field configured to describe non-encrypted type information identifying a type of the first box, and a second box including a second box data field configured to describe information data regarding the encrypted multimedia data stored in the first box data, a second size field configured to describe second non-encrypted size information indicating a size of the second box by bytes, and a second type field configured to describe second non-encrypted type information identifying a type of the second box, the apparatus comprising: receiving module configured to receive the information data stored in the second box data and holding the information data; and decrypting module configured to decrypt each of the multimedia data units stored in the first box data based on the information data, and outputting the decrypted data units, sequentially.
21. A data structure of a multimedia file format comprising: a media data box including a media data box data field configured to store encrypted multimedia data, a first size field configured to describe first non-encrypted size information indicating a size of the media data box by bytes, and a first type field configured to describe first non-encrypted type information identifying a type of the media data box; and a movie box including a movie box data field configured to describe information data regarding the encrypted multimedia data stored in the media box data, including a second size field configured to describe second non-encrypted size information indicating a size of the movie box by bytes, and a second type field configured to describe second non-encrypted type information identifying a type of the movie box.
22. The data structure of the multimedia file format according to claim 21, wherein, in the media box, audio or moving picture encoded data are stored as an arrangement of data samples in the media data box data field, and the data sample includes encrypted and encoded data.
23. The data structure of a multimedia file format according to one of claim 22, wherein the encrypted and encoded data are the data arrangement encrypted based on a unit of a block set as a predetermined data length, and a non-encrypted and encoded data arrangement is received in the sample by using the block as a reference.
24. The data structure of the multimedia file format according to claim 21, wherein, in the media data box, audio or moving picture encoded data are stored as an arrangement of data samples in the media box, one or a plurality of samples are set in a chunk, and the chunk includes encrypted and encoded data.
25. The data structure of a multimedia file format according to claim 24, wherein the encoded multimedia data corresponds to a data arrangement encrypted based on a unit of a block set as a predetermined data length, and a non-encrypted and encoded data arrangement is received in the chunk by using the block as a reference.
26. The data structure of a multimedia file format according to claim 21, wherein the information data in the movie box data field is encrypted.